Fujitsu Laboratories TREC7 Report

Authors

  • Isao Namba
  • Nobuyuki Igata
  • Hisayuki Horai
  • Kiyoshi Nitta
  • Kunio Matsui
Abstract

In our first participation in TREC, our focus was on improving the basic ranking systems and applying text clustering techniques for query expansion. We tested a variety of techniques, including reference measures, passage retrieval, and data fusion, for the basic ranking systems. Some techniques were used in the official run; others were not used because of time limitations. We applied the text clustering techniques for query expansion with a text clustering engine. Clustering-based query expansion uses the top N best text clusters from the top 1000 documents instead of just using the top N documents. Clustering-based query expansion produces better results than simple query expansion based on passage retrieval. We submitted three runs: Flab7at, Flab7ad, and Flab7atE. Flab7at is a combination of ranking and query expansion by clustering the top 1000 documents on the title field, Flab7ad is a combination of ranking and query expansion by clustering on the description field, and Flab7atE is a combination of ranking with Boolean (existence) operators and query expansion by passage retrieval.
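The idea of expanding a query from the best clusters of the retrieved documents, rather than from the top N documents directly, can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration and is not taken from the report: the whitespace tokenizer, the single-pass "leader" clustering with a cosine threshold, ranking clusters by the summed first-pass scores of their members, and choosing expansion terms by document frequency. The function names (`leader_cluster`, `expand_query`) and parameters are hypothetical.

```python
from collections import Counter
import math


def tokenize(text):
    """Lowercase whitespace tokenizer (a stand-in for real indexing)."""
    return text.lower().split()


def cosine(a, b):
    """Cosine similarity between two term-frequency Counters."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


def leader_cluster(vectors, threshold):
    """Single-pass clustering: a document joins the first cluster whose leader is similar enough."""
    clusters = []  # each cluster is a list of document indices; the first index is the leader
    for i, vec in enumerate(vectors):
        for members in clusters:
            if cosine(vec, vectors[members[0]]) >= threshold:
                members.append(i)
                break
        else:
            clusters.append([i])
    return clusters


def expand_query(query, ranked_docs, n_clusters=1, n_terms=2, threshold=0.3):
    """ranked_docs: list of (text, retrieval_score) pairs from a first-pass search."""
    vectors = [Counter(tokenize(text)) for text, _ in ranked_docs]
    clusters = leader_cluster(vectors, threshold)
    # Rank clusters by the summed first-pass scores of their members (an assumed criterion).
    clusters.sort(key=lambda members: sum(ranked_docs[i][1] for i in members), reverse=True)
    query_terms = set(tokenize(query))
    counts = Counter()
    for members in clusters[:n_clusters]:
        for i in members:
            # Iterating a Counter yields each distinct term once, so this counts document frequency.
            counts.update(t for t in vectors[i] if t not in query_terms)
    # Highest document frequency first; ties broken alphabetically for determinism.
    candidates = sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))
    return tokenize(query) + [term for term, _ in candidates[:n_terms]]


# Toy first-pass result list (made-up data, not from the report).
example_docs = [
    ("solar panel energy", 3.0),
    ("football match score", 2.8),
    ("solar energy storage", 2.5),
    ("solar panel energy cost", 2.0),
]
print(expand_query("solar", example_docs))
```

In this toy list, the second-ranked document is off topic. Taking expansion terms from the two highest-scoring documents would draw on it, whereas the best-scoring cluster contains only the three documents about solar energy, so the expansion terms come from those.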

Similar resources

Fujitsu Laboratories TREC8 Report

This year a Fujitsu Laboratory team participated in three tracks: the ad hoc track, the small web track, and the large web track. As basic techniques, we compared four popular stemmers, and we developed simple techniques for removing stop patterns from TREC queries. For the ad hoc task and the small web track, we used the same techniques. We experimented with area weighting, co-occurrence boosting, bi-gram utilization,...

Fujitsu Laboratories TREC2001 Report

This year a Fujitsu Laboratory team participated in the web tracks. For both the ad hoc task and the entry point search task, we combined the score of normal ranking search with that of page ranking techniques. For the ad hoc style task, the effect of page ranking was very limited. We got only a very small improvement for title field search, and the page rank was not effective for description and narrative field sear...

Fujitsu Laboratories TREC9 Report

This year a Fujitsu Laboratory team participated in the web tracks. For TREC9 we experimented with passage retrieval, which is expected to be effective for Web pages that contain more than one topic. To split documents into passages, we used an NLP-based paragraph detection program rather than a fixed (or variable) window size. However, it did not produce better results for the TREC9 Web data. For indexing large web data faster, ...

CEAX’s Learning Support System to Explore Cultural Heritage Objects without Keyword Search

Taizo Yamada, Kenro Aihara, Noriko Kando, Satoko Fujisawa, Yusuke Uehara, Takayuki Baba, Shigemi Nagata, Takashi Tojo, Yuko Hiroshima and Jun Adachi 1 National Institute of Informatics, 2-1-2 Hitotsubashi, Chiyoda-ku, Tokyo, Japan 2 Dept. of Informatics, the Graduate University for Advanced Studies, 2-1-2 Hitotsubashi, Chiyoda-ku, Tokyo, Japan 3 Fujitsu Laboratories Ltd., 1-1 Kamikodanaka 4, Na...

Publication date: 1999